The META-SHARE Language Resources Sharing Infrastructure: Principles, Challenges, Solutions
Abstract
Language resources have become a key factor in the development cycle of language technology. The currently prevailing methodologies, the sheer number of languages, and the vast volumes of digital content, together with the wide palette of useful content processing applications, make new models for managing the underlying language resources indispensable. This paper presents META-SHARE, an open resource exchange infrastructure, which aims to boost visibility, documentation, identification, openness and sharing, collaboration, preservation and interoperability of language data and basic language processing tools. META-SHARE is implemented as a network of distributed repositories of language resources. It offers providers and consumers of resources the necessary functionalities for describing, storing, searching, licensing and downloading language resources in a single integrated technical platform. META-SHARE favours and aligns itself with the growing open data and open source tools movement. To this end, it has prepared the necessary underlying legal framework consisting of a Charter for language resource sharing, as well as a set of licensing templates aiming to act as recommended licence models in an attempt to facilitate the legal interoperability of language resources. In its current version, META-SHARE features 13 resource repositories, with over 1200 resource packages.
Similar resources
A Data Sharing and Annotation Service Infrastructure
This paper reports on and demonstrates META-SHARE/QT21, a prototype implementation of a data sharing and annotation service platform, which was based on the META-SHARE infrastructure. META-SHARE, which has been designed for sharing datasets and tools, is enhanced with a processing layer for annotating textual content with appropriate NLP services that are documented with the appropriate metadat...
The META-SHARE Metadata Schema for the Description of Language Resources
This paper presents a metadata model for the description of language resources proposed in the framework of the META-SHARE infrastructure, aiming to cover both datasets and tools/technologies used for their processing. It places the model in the overall framework of metadata models, describes the basic principles and features of the model, elaborates on the distinction between minimal and maxim...
One Ontology to Bind Them All: The META-SHARE OWL Ontology for the Interoperability of Linguistic Datasets on the Web
META-SHARE is an infrastructure for sharing Language Resources (LRs) where significant effort has gone into providing carefully curated metadata about LRs. However, in the face of the flood of data that is used in computational linguistics, a manual approach cannot suffice. We present the development of the META-SHARE ontology, which transforms the metadata schema used by META-SHARE into a...
META-SHARE: One year after
This paper presents META-SHARE (www.meta-share.eu), an open language resource infrastructure, and its usage since its Europe-wide deployment in early 2013. META-SHARE is a network of repositories that store language resources (data, tools and processing services) documented with high-quality metadata, aggregated in central inventories allowing for uniform search and access. META-SHARE was devel...
META-SHARE v2: An Open Network of Repositories for Language Resources including Data and Tools
We describe META-SHARE which aims at providing an open, distributed, secure, and interoperable infrastructure for the exchange of language resources, including both data and tools. The application has been designed and is developed as part of the T4ME Network of Excellence. We explain the underlying motivation for such a distributed repository for metadata storage and give a detailed overview o...